Informed algorithms for sound source separation in enclosed reverberant environments

Author

  • Muhammad Salman Khan
Abstract

While humans can separate a sound of interest amidst a cacophony of contending sounds in an echoic environment, machine-based methods lag behind in solving this task. This thesis therefore aims to improve the performance of audio separation algorithms when they are “informed”, i.e., have access to source location information. These locations are assumed to be known a priori in this work, for example through video processing. Initially, a method based on a multi-microphone array combined with binary time-frequency masking is proposed. A robust least-squares, frequency-invariant, data-independent beamformer designed with the location information is used to estimate the sources. To further enhance the estimated sources, post-processing based on binary time-frequency masking is applied, but cepstral-domain smoothing is required to mitigate musical noise. To tackle the under-determined case and further improve separation performance at higher reverberation times, a two-microphone method that is inspired by human auditory processing and generates soft time-frequency masks is described. In this approach, the interaural level difference, interaural phase difference and mixing vectors are probabilistically modeled in the time-frequency domain, and the model parameters are learned through the expectation-maximization (EM) algorithm. A direction vector is estimated for each source using the location information and is used as the mean parameter of the mixing vector model. Soft time-frequency masks are used to reconstruct the sources. A spatial covariance model is then in-
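The two-microphone quantities named in the abstract (interaural level and phase differences, a location-derived direction vector, and binary versus soft time-frequency masks) can be illustrated for a single time-frequency bin. The following is only a minimal sketch and not the thesis's method: it assumes a far-field, free-field two-microphone model, a speed of sound of 343 m/s, and a simple alignment score in place of the probabilistic model fitted with EM. All function names are hypothetical.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not taken from the source


def direction_vector(freq_hz, mic_spacing_m, azimuth_rad):
    """Unit-norm direction vector for two microphones (far-field, free-field assumption)."""
    delay = mic_spacing_m * math.sin(azimuth_rad) / SPEED_OF_SOUND
    d = [1.0 + 0j, cmath.exp(-2j * math.pi * freq_hz * delay)]
    norm = math.sqrt(sum(abs(v) ** 2 for v in d))
    return [v / norm for v in d]


def interaural_cues(left, right):
    """Return (ILD in dB, IPD in radians) for one bin, using the left/right ratio."""
    ratio = left / right
    return 20.0 * math.log10(abs(ratio)), cmath.phase(ratio)


def alignment_score(x, d):
    """Normalized |d^H x|^2: 1.0 when the observation x is parallel to unit-norm d."""
    inner = sum(di.conjugate() * xi for di, xi in zip(d, x))
    energy = sum(abs(xi) ** 2 for xi in x)
    return abs(inner) ** 2 / energy


def masks_for_bin(x, direction_vectors):
    """Return (soft, binary) masks over sources for one time-frequency bin."""
    scores = [alignment_score(x, d) for d in direction_vectors]
    total = sum(scores)
    soft = [s / total for s in scores]      # values in [0, 1] that sum to 1
    best = scores.index(max(scores))
    binary = [1.0 if i == best else 0.0 for i in range(len(scores))]
    return soft, binary


# Two hypothetical sources at 0 and 90 degrees, 1 kHz bin, 10 cm spacing.
d1 = direction_vector(1000.0, 0.1, 0.0)
d2 = direction_vector(1000.0, 0.1, math.pi / 2)
x = [2.0 * v for v in d1]                   # bin dominated by source 1
soft, binary = masks_for_bin(x, [d1, d2])
```

A binary mask assigns each bin wholly to one source, which is the hard decision that tends to produce musical noise, whereas a soft mask shares the bin's energy between sources in proportion to the scores.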

Similar resources

Exploiting the Self-Steering Capability of Blind Source Separation to Localize Two or More Sound Sources in Adverse Environments

Blind Source Separation (BSS) algorithms have often been interpreted as a set of blind adaptive beamformers. Although this interpretation does not entirely hold under realistic conditions, it gives some useful insights on the self-steering capacity of BSS techniques. Actually, while accurate source location information is usually necessary to steer a beamformer, BSS offers the possibility to re...

Binaural Source Separation in Non-ideal Reverberant Environments

This paper proposes a framework for separating several speech sources in non-ideal, reverberant environments. A movable human dummy head residing in a normal office room is used to model the conditions humans experience when listening to complex auditory scenes. Before the source separation takes place the human dummy head explores the auditory scene and extracts characteristics the same way as...

Estimation of fundamental frequency of reverberant speech by utilizing complex cepstrum analysis

This paper reports comparative evaluations of twelve typical methods of estimating fundamental frequency (F0) over huge speech-sound datasets in artificial reverberant environments. They involve several classic algorithms such as Cepstrum, AMDF, LPC, and modified autocorrelation algorithms. Other methods involve a few modern instantaneous amplitude- and/or frequency-based algorithms, such as STRA...

Integrating Monaural and Binaural Cues for Sound Localization and Segregation in Reverberant Environments

The problem of segregating a sound source of interest from an acoustic background has been extensively studied due to applications in hearing prostheses, robust speech/speaker recognition and audio information retrieval. Computational auditory scene analysis (CASA) approaches the segregation problem by utilizing grouping cues involved in the perceptual organization of sound by human listeners. ...

A real-time blind source separation scheme and its application to reverberant and noisy acoustic environments

In this paper, we present an efficient real-time implementation of a broadband algorithm for blind source separation (BSS) of convolutive mixtures. A recently introduced generic BSS framework based on a matrix formulation allows simultaneous exploitation of nonwhiteness and nonstationarity of the source signals using second-order statistics. We demonstrate here that this general scheme leads to...

Publication date: 2016